Adaptive Hierarchy-Branch Fusion for Online Knowledge Distillation

Authors

Abstract

Online Knowledge Distillation (OKD) is designed to alleviate the dilemma that a high-capacity pre-trained teacher model is not available. However, existing methods mostly focus on improving the ensemble prediction accuracy from multiple students (a.k.a. branches) and often overlook the homogenization problem, which makes the students saturate quickly and hurts performance. We assume that the intrinsic bottleneck of homogenization comes from the identical branch architectures and the coarse ensemble strategy. We propose a novel Adaptive Hierarchy-Branch Fusion framework for Online Knowledge Distillation, termed AHBF-OKD, which designs hierarchical branches and an adaptive hierarchy-branch fusion module to boost model diversity and aggregate complementary knowledge. Specifically, we first introduce hierarchical architectures to construct diverse peers by increasing the depth monotonously on the basis of the target branch. To effectively transfer knowledge from the most complex branch to the simplest target branch, the adaptive fusion module creates teacher assistants recursively and regards the target branch as the smallest assistant. During training, the assistant from the previous hierarchy is explicitly distilled by the assistant and branch from the current hierarchy. Thus, importance scores for the different branches are adaptively allocated to reduce branch homogenization. Extensive experiments demonstrate the effectiveness of AHBF-OKD on several datasets, including CIFAR-10/100 and ImageNet 2012. For example, on ImageNet 2012, ResNet-18 trained with AHBF-OKD achieves a Top-1 error of 29.28%, which significantly outperforms state-of-the-art methods. The source code is available at https://github.com/linruigong965/AHBF.
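The following is a minimal PyTorch sketch of the recursive fusion-and-distillation loss described above, written under our own assumptions: the sigmoid-gated convex combination, the temperature value, and all function names are illustrative rather than the authors' implementation (see the linked repository for that).

```python
# Hedged sketch of the recursive hierarchy-branch fusion loss; gating and
# temperature are illustrative assumptions, not the paper's exact scheme.
import torch
import torch.nn.functional as F

T = 3.0  # softening temperature for distillation (assumed value)

def kd_loss(student_logits, teacher_logits):
    """KL divergence between temperature-softened distributions."""
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * T * T

def ahbf_okd_loss(branch_logits, gates, labels):
    """branch_logits: logits per branch, ordered from the simplest (target)
    branch to the deepest; gates: one learnable scalar per fusion step.
    The target branch acts as the smallest teacher assistant; each fusion
    step builds the next assistant and distills the previous one toward it."""
    # Every branch is also trained on the hard labels.
    loss = sum(F.cross_entropy(z, labels) for z in branch_logits)
    assistant = branch_logits[0]
    for z, g in zip(branch_logits[1:], gates):
        a = torch.sigmoid(g)                   # adaptive importance score
        fused = a * assistant + (1.0 - a) * z  # next-hierarchy assistant
        # Distill the previous assistant toward the (stop-gradient) new one.
        loss = loss + kd_loss(assistant, fused.detach())
        assistant = fused
    return loss
```

The key point is the recursion: each fused assistant becomes the distillation teacher of the assistant below it, so the simplest (target) branch accumulates knowledge from every deeper hierarchy.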

Similar articles

Effective Online Knowledge Graph Fusion

Recently, Web search engines have empowered their search with knowledge graphs to satisfy the increasing demand for complex information about entities. Each engine offers an online knowledge graph service to display highly relevant information about the query entity in the form of a structured summary called a knowledge card. The cards from different engines might be complementary. Therefore, it is...

A knowledge hierarchy model for adaptive multi-agent systems

In a changing world, the need for adaptivity in software systems is more apparent than ever, since the cost of software maintenance is a huge burden. Adaptivity is needed because business processes, business rules, and business terms constantly evolve. This paper argues for a radical solution that makes use of the inherent adaptivity of software agents. The Adaptive Agent Model (AAM) is described in te...

Online adaptive hidden Markov model for multi-tracker fusion

In this paper, we propose a novel method for visual tracking called HMMTxD. The method fuses information from complementary trackers and a detector by utilizing a hidden Markov model whose latent states correspond to a binary vector expressing the failure of individual trackers. The Markov model is trained in an unsupervised way, relying on an online learned detector to provide a source of trac...
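As a toy illustration of the fusion mechanism sketched in this abstract (all numbers and names below are our assumptions, not the paper's learned model): latent HMM states enumerate which of the N trackers have failed, and one forward-algorithm step updates the belief over those states from an observation likelihood.

```python
# Toy HMM-fusion sketch: states are binary failure vectors over N trackers.
import itertools
import numpy as np

N = 3                                               # number of fused trackers
states = list(itertools.product([0, 1], repeat=N))  # 0 = working, 1 = failed

def forward_step(belief, trans, emit_prob):
    """One HMM forward update over the 2^N failure states.
    belief: (2^N,), trans: (2^N, 2^N) row-stochastic, emit_prob: (2^N,)."""
    belief = emit_prob * (trans.T @ belief)
    return belief / belief.sum()

# Demo: sticky transitions; the observation suggests tracker 0 is working
# (e.g. its output agrees with the detector). Values are made up.
S = len(states)
trans = np.full((S, S), 0.01)
np.fill_diagonal(trans, 1.0)
trans /= trans.sum(axis=1, keepdims=True)
emit = np.array([0.9 if s[0] == 0 else 0.1 for s in states])
belief = forward_step(np.full(S, 1.0 / S), trans, emit)
```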

Sequence-Level Knowledge Distillation

Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However, to reach competitive performance, NMT models need to be exceedingly large. In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neura...
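A minimal sketch of the sequence-level variant of this idea, under assumptions (the `teacher.generate` call and the model signatures are placeholders, not a specific library API): the student is trained with ordinary cross-entropy on sequences the teacher itself decodes, rather than on the gold references.

```python
# Hedged sketch of sequence-level distillation for seq2seq models.
import torch
import torch.nn.functional as F

def seq_level_kd_step(teacher, student, src_batch):
    with torch.no_grad():
        # 1) Teacher decodes pseudo-targets (beam search in the paper).
        pseudo_tgt = teacher.generate(src_batch)      # assumed interface
    # 2) Student trains with plain cross-entropy on those outputs,
    #    shifted by one position for teacher forcing.
    logits = student(src_batch, pseudo_tgt[:, :-1])   # assumed interface
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        pseudo_tgt[:, 1:].reshape(-1),
    )
```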

Knowledge Distillation for Bilingual Dictionary Induction

Leveraging zero-shot learning to learn mapping functions between vector spaces of different languages is a promising approach to bilingual dictionary induction. However, methods using this approach have not yet achieved high accuracy on the task. In this paper, we propose a bridging approach, where our main contribution is a knowledge distillation training objective. As teachers, rich resource ...
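Since this abstract is truncated, the following is only a rough, assumed illustration of a distillation objective in this setting: a student mapping `W` fits a seed dictionary while also matching the consensus prediction of pre-trained teacher mappings. `alpha`, `kd_mapping_loss`, and the whole setup are hypothetical.

```python
# Hypothetical sketch: distilling cross-lingual mapping matrices.
import torch

def kd_mapping_loss(W, X_src, Y_tgt, teachers, alpha=0.5):
    """W: (d, d) student mapping; X_src, Y_tgt: (n, d) seed embedding pairs;
    teachers: list of (d, d) pre-trained mappings (assumed setup)."""
    pred = X_src @ W
    sup = ((pred - Y_tgt) ** 2).mean()        # fit the seed dictionary
    with torch.no_grad():
        consensus = torch.stack([X_src @ Wt for Wt in teachers]).mean(dim=0)
    kd = ((pred - consensus) ** 2).mean()     # match the teacher consensus
    return alpha * sup + (1.0 - alpha) * kd
```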


Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i6.25937